1. Introduction

We study tests of hypotheses for a regression parameter within the
context of the Box and Cox (1964) power transformation family. The model
is given by


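$$y_i^{(\lambda)} = x_i\beta + \sigma\varepsilon_i, \qquad i = 1, \ldots, N, \tag{1}$$

$$x_i\beta = (v_i \;\; w_i)\begin{pmatrix}\gamma_1 \\ \gamma_2\end{pmatrix}, \qquad (w_i),\ \gamma_2 \text{ are scalars},$$

$$y^{(\lambda)} = \begin{cases} (y^{\lambda} - 1)/\lambda & \text{if } \lambda \neq 0, \\ \log y & \text{if } \lambda = 0. \end{cases}$$

Here $\sigma$ is the standard deviation and the $\{\varepsilon_i\}$ are independently
and identically distributed with mean zero and variance one. We are
interested in testing the hypothesis

$$H_0:\ \gamma_2 = 0. \tag{2}$$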

In what follows, $\lambda^*$ will be an estimate of $\lambda$, $\beta^* = (\gamma_1^*, \gamma_2^*)$ will be
the least squares estimates in the estimated scale $\lambda^*$, and $\hat\beta = (\hat\gamma_1, \hat\gamma_2)$
will be the least squares estimates calculated in the true but unknown
scale $\lambda$. A substantial literature now exists, although there has been
no real emphasis on the hypothesis testing problem (2); see Andrews (1971),
Atkinson (1973), Hinkley (1975), Carroll (1980), Bickel and Doksum (1980),
and Carroll and Ruppert (1980) as a subset of this literature.

Of course, if the errors are normally distributed, the obvious method
for testing (2) is the likelihood ratio test or LRT. Equivalent to this in
large samples is the Wald test WT, which is based on $\gamma_2^*$ divided by an
appropriate estimate of its standard error. In practice what is most often
done is to select the scale $\lambda^*$ and then do the usual analysis in this
scale, i.e., divide $\gamma_2^*$ by the usual formula for its estimated standard
error. We denote such an analysis by CT, for conditional test based on
the estimated scale $\lambda^*$. Hinkley (1975) was apparently the first to
recognize that these tests are not all equivalent.
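The conditional test CT described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of any cited author: it assumes the scale is chosen by maximizing the normal-theory profile likelihood over a grid (only one of several possible estimates of the scale), and the names `boxcox`, `ols`, `profile_loglik`, `conditional_test` as well as the synthetic data are hypothetical.

```python
import numpy as np


def boxcox(y, lam):
    """Box-Cox transform: (y**lam - 1) / lam, or log(y) when lam == 0."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam


def ols(A, z):
    """Least squares fit of z on the columns of A; returns (coef, rss)."""
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    resid = z - A @ coef
    return coef, float(resid @ resid)


def profile_loglik(y, A, lam):
    """Normal-theory profile log-likelihood of lam, constants dropped."""
    n = len(y)
    _, rss = ols(A, boxcox(y, lam))
    return -0.5 * n * np.log(rss / n) + (lam - 1.0) * np.sum(np.log(y))


def conditional_test(y, A, grid):
    """Return (lam_star, t); t tests the last column of A in scale lam_star."""
    lam_star = max(grid, key=lambda lam: profile_loglik(y, A, lam))
    n, k = A.shape
    coef, rss = ols(A, boxcox(y, lam_star))
    cov = rss / (n - k) * np.linalg.inv(A.T @ A)   # usual OLS covariance
    return float(lam_star), float(coef[-1] / np.sqrt(cov[-1, -1]))


# Hypothetical data: true scale 0.5, gamma_2 = 0, very small error.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
w = np.tile([-1.0, 1.0], 100)
A = np.column_stack([np.ones(200), x, w])
z = 3.0 + x + 1e-4 * rng.standard_normal(200)
y = (1.0 + 0.5 * z) ** 2              # inverse Box-Cox transform at lam = 0.5
grid = np.round(np.linspace(-1.0, 2.0, 301), 2)
lam_star, t_stat = conditional_test(y, A, grid)
```

The t statistic is computed exactly as if the selected scale were known in advance, which is what makes the test "conditional"; whether its nominal level is correct is the question the rest of the paper addresses.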


Example 1. Consider the location model for log-normal data with
$\lambda = 0$, $\sigma = 1$ and normal errors following

$$\log Y_i = \mu + \varepsilon_i.$$

Estimate $\lambda$ by the normal theory MLE. Then $N^{1/2}(\mu^* - \mu)$ is asymptotically
normally distributed with mean zero and variance $1 + (1 + \mu^2)^2/6$. Thus,
in testing $H_0: \mu = 0$, the test CT which rejects when $N^{1/2}|\mu^*/\sigma^*|$ exceeds
the normal $1 - \alpha/2$ percentile always has a higher level than the desired
level $\alpha$, at least asymptotically. □
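The variance quoted in Example 1 can be checked numerically from the Fisher information for the three parameters (location, variance, transformation). The sketch below is only a verification aid: the score expressions are derived here from the normal-theory log-likelihood at a transformation parameter of zero and unit variance, the expectations are computed by Gauss-Hermite quadrature, and the function name `example1_variance` is hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def example1_variance(mu, deg=40):
    """Asymptotic variance of N^{1/2}(mu* - mu) via the inverse information."""
    e, wts = hermegauss(deg)          # nodes and weights for exp(-e^2 / 2)
    wts = wts / wts.sum()             # normalise to N(0, 1) expectations
    z = mu + e                        # z = log Y
    scores = np.vstack([
        e,                            # score for mu
        (e**2 - 1.0) / 2.0,           # score for sigma^2
        z - e * z**2 / 2.0,           # score for lambda at lambda = 0
    ])
    info = (scores * wts) @ scores.T  # E[score score']
    return float(np.linalg.inv(info)[0, 0])


def example1_closed_form(mu):
    """Closed form stated in Example 1."""
    return 1.0 + (1.0 + mu**2) ** 2 / 6.0
```

At a location of zero both functions give 7/6, which exceeds the value one used implicitly by CT; this is the source of the inflated level.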


However, in some cases the test CT is quite good. Bickel and
Doksum (1980) recognized this for the case of simple linear regression
through the origin. We present an illustrative example.

Example 2. Consider log-normal simple linear regression
with $\lambda = 0$, $\sigma = 1$:

$$\log Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,$$

$$N^{-1}\sum_{i=1}^{N} x_i^k \to \mu_k, \quad k = 1, 2, 3, 4, \qquad \mu_1 = 0, \quad \mu_2 = 1.$$

In this important special case, it is possible to show by very detailed
likelihood calculations that $N^{1/2}(\beta_1^* - \beta_1)$ is asymptotically normally
distributed with mean zero and variance

$$s^2(\beta_1) = 1 + 4\beta_1^2\,(\beta_0 + \beta_1\mu_3/2)^2\,\bigl(6 + 8\beta_1^2 + \beta_1^4(\mu_4 - 1 - \mu_3^2)\bigr)^{-1}. \tag{3}$$

Of course, $N^{1/2}(\hat\beta_1 - \beta_1)$ has limit variance equal to one. The test
CT rejects $H_0: \beta_1 = 0$ whenever

$$N^{1/2}|\beta_1^*|/\sigma^* > t(\alpha, N-2),$$

where $t(\alpha, N-2)$ is the two-sided t-percentile. The Wald test WT
instead divides $\beta_1^*$ by an estimate of its standard error, which here is
based on (3). Asymptotically,

(a) the test CT has the correct level, since $s^2(0) = 1$,

(b) the test CT has higher power than the WT or LRT. □
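Formula (3) can be checked against a direct numerical inversion of the information matrix for a concrete design. The sketch below is an illustration under assumptions, not the paper's derivation: the design points are hypothetical, the scores are those of the normal-theory likelihood at a transformation parameter of zero and unit variance, and the names `s2_formula` and `s2_information` are introduced here.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def s2_formula(b0, b1, mu3, mu4):
    """Right-hand side of equation (3)."""
    return 1.0 + 4.0 * b1**2 * (b0 + b1 * mu3 / 2.0) ** 2 / (
        6.0 + 8.0 * b1**2 + b1**4 * (mu4 - 1.0 - mu3**2))


def s2_information(b0, b1, x, deg=40):
    """Slope variance from the inverse of the design-averaged information."""
    e, wts = hermegauss(deg)
    wts = wts / wts.sum()                      # N(0, 1) expectations
    info = np.zeros((4, 4))
    for xi in x:
        z = b0 + b1 * xi + e                   # z = log Y at this design point
        scores = np.vstack([
            e,                                 # intercept
            xi * e,                            # slope
            (e**2 - 1.0) / 2.0,                # sigma^2
            z - e * z**2 / 2.0,                # lambda at lambda = 0
        ])
        info += (scores * wts) @ scores.T
    info /= len(x)
    return float(np.linalg.inv(info)[1, 1])


# Hypothetical skewed design, standardised so that mu_1 = 0 and mu_2 = 1.
x = np.array([0.0, 1.0, 2.0, 5.0])
x = (x - x.mean()) / x.std()
mu3, mu4 = float(np.mean(x**3)), float(np.mean(x**4))
```

Setting the slope to zero in either function returns one, in line with statement (a).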

The theory developed by Bickel and Doksum (1980) and outlined in
the next section suggests that when the naive test CT has the correct
level, it will also have better power than the WT or LRT. That such
a phenomenon is possible is not too surprising in view of the fact that
when $\beta = 0$ and $\sigma = 1$ are known, the statistical curvature (Efron (1975))
for estimating $\lambda$ is $\gamma_0^2 = 10\tfrac{2}{3}$ when $\lambda = 0$; Efron suggests
$\gamma_0^2 \geq 1/8$ is large!
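The curvature value can be reproduced with a short calculation. The sketch below assumes the one-parameter model in which the transformed response is standard normal, so that at a transformation parameter of zero the first and second derivatives of the log-likelihood are $z - z^3/2$ and $-7z^4/12$ with $z = \log y$ standard normal (a derivation made here, not quoted from the paper). It then applies Efron's definition $\gamma^2 = (\nu_{20}\nu_{02} - \nu_{11}^2)/\nu_{20}^3$; the function name is hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def curvature_at_lambda_zero(deg=40):
    """Efron's squared statistical curvature for lambda at lambda = 0."""
    z, wts = hermegauss(deg)
    wts = wts / wts.sum()                     # N(0, 1) expectations

    def expect(f):
        return float(np.sum(wts * f))

    l1 = z - z**3 / 2.0                       # first derivative of log-likelihood
    l2 = -7.0 * z**4 / 12.0                   # second derivative of log-likelihood
    nu20 = expect(l1**2)                      # Fisher information
    nu11 = expect(l1 * l2)
    nu02 = expect(l2**2) - nu20**2
    return (nu20 * nu02 - nu11**2) / nu20**3


gamma_sq = curvature_at_lambda_zero()
```

Under these assumptions the Fisher information is 7/4 and the curvature works out to 32/3, far above Efron's threshold of 1/8.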

The purpose of this article is to describe situations in which the
test CT has the correct level; this we do in Section 3. Basically, we
require only that the $\{w_i\}$ be independent of the $\{v_i\}$, a situation
that obtains in many important models including simple linear regression
and balanced factorial designs.


2. Small $\sigma$ Asymptotics

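Assume that the errors $\{\varepsilon_i\}$ are symmetrically distributed. Bickel
and Doksum (1980) find major technical simplifications by letting $\sigma \to 0$
as $N \to \infty$. Define

$$A' = (x_1' \; \cdots \; x_N'), \qquad M = A(A'A)^{-1}A',$$

$$Q = \text{[definition illegible in the source]}, \qquad d' = (d_1 \; \cdots \; d_N),$$

$$d_i = \lambda^{-2}\bigl((v_i - 1) - v_i \log(v_i)\bigr), \qquad v_i = 1 + \lambda x_i\beta,$$

$$N^{-1}E_N = N^{-1}(d'd - d'Md) \to E > 0.$$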

N 

From standard regression theory, we know that $N^{1/2}(\hat\beta - \beta)/\sigma$ is
asymptotically normally distributed with mean zero and covariance

$$\Sigma_0 = \lim_{N\to\infty}(N^{-1}A'A)^{-1}. \tag{4}$$

Bickel and Doksum show that $N^{1/2}(\beta^* - \beta)/\sigma$ has asymptotic covariance

$$\Sigma_1 = \Sigma_0 + \lim_{N\to\infty} N^{-1}QQ'/E_N. \tag{5}$$

Suppose that $x_i = (v_i \;\; w_i)$ with $w_i$ scalar and $v_i$ a $(1 \times p)$ vector.
Define $\eta = (0 \; \cdots \; 0 \; 1)$. Then asymptotically the test CT rejects
$H_0: \gamma_2 = 0$ if

$$N^{1/2}\,|\eta\beta^*| \,/\, \bigl(\sigma^*(\eta\Sigma_0\eta')^{1/2}\bigr) > t(\alpha, N-p-1).$$

Equations (4) and (5) tell us that CT has the correct level if

$$\eta(\Sigma_1 - \Sigma_0)\eta' = 0 \quad \text{when } H_0: \gamma_2 = 0 \text{ obtains.} \tag{6}$$

Further, since $QQ'$ is positive semidefinite, CT will have power at
least as large as WT or LRT when (6) obtains. Another example
illustrates this.




Example 3. We consider the small $\sigma$ asymptotics for the two group
analysis of covariance model when $\lambda = 0$:

$$\log Y_i = \mu + \rho s_i + \beta X_i + \sigma\varepsilon_i, \qquad i = 1, \ldots, 2N,$$

$$s_1 = 1,\ s_2 = -1,\ s_3 = 1,\ s_4 = -1,\ \ldots,\ s_{2N} = -1.$$

The parameter $\rho$ is the treatment effect, and we are interested in
testing

$$H_0:\ \rho = 0.$$

Let $E$ be as above and set

$$\sum_{i=1}^{2N} X_i = 0, \qquad \sum_{i=1}^{2N} X_i^2 = 2N, \qquad \sum_{i=1}^{2N} s_i X_i = 2Na,$$

$$\sum_{i=1}^{2N} s_i X_i^2 = 2Nb, \qquad \sum_{i=1}^{2N} X_i^3 = 2Nc.$$


We will show that the test CT has the correct level when the design
is balanced over the two treatments in the first two moments of $\{X_i\}$,
i.e., $a = b = 0$. When $\lambda = 0$ is known, $N^{1/2}(\hat\rho - \rho)/\sigma$ has asymptotic variance
$(1 - a^2)^{-1}$. Estimating $\lambda$ by maximum likelihood, we find that
$N^{1/2}(\rho^* - \rho)/\sigma$ has asymptotic variance

$$s^2 = (1 - a^2)^{-1} + \bigl(4E(1 - a^2)^2\bigr)^{-1}\Bigl\{(\beta^2 - 2a\rho\beta)\,b - \beta^2 a c + 2\mu\rho(1 - a^2)\Bigr\}^2.$$

Hence the test CT has the correct level if $a = b = 0$: under $H_0: \rho = 0$ every term inside the braces then vanishes.


□ 
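The variance inflation in Example 3 can be illustrated numerically. The sketch below rests on assumptions that go beyond the text: it takes the small $\sigma$ derivative vector at $\lambda = 0$ to be $d_i = -m_i^2/2$ with $m_i = \mu + \rho s_i + \beta X_i$ (the limit of the expression for $d_i$ as $\lambda \to 0$), and it reads the inflation term as the squared least squares coefficient of $d$ on $s$, divided by the normalized residual sum of squares $E$. The function name and the designs are hypothetical.

```python
import numpy as np


def ancova_inflation(mu, rho, beta, s, X):
    """Return (direct, closed_form) values of the variance inflation term."""
    n = len(X)
    a = np.mean(s * X)
    b = np.mean(s * X**2)
    c = np.mean(X**3)
    A = np.column_stack([np.ones(n), s, X])
    m = mu + rho * s + beta * X
    d = -m**2 / 2.0                          # assumed small-sigma derivative
    coef, *_ = np.linalg.lstsq(A, d, rcond=None)
    resid = d - A @ coef
    E = resid @ resid / n                    # (d'd - d'Md) / (2N)
    direct = coef[1] ** 2 / E
    braces = (beta**2 - 2*a*rho*beta) * b - beta**2 * a * c + 2*mu*rho*(1 - a**2)
    closed = braces**2 / (4.0 * E * (1 - a**2) ** 2)
    return float(direct), float(closed)


def standardise(X):
    """Centre and scale so that sum X = 0 and sum X^2 = 2N."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean()) / X.std()


s = np.tile([1.0, -1.0], 4)
X_unbalanced = standardise([0.3, 1.1, -0.4, 2.0, 0.9, -1.2, 0.1, 1.7])
X_balanced = standardise([-1.0, -1.0, 0.0, 0.0, 2.0, 2.0, 0.5, 0.5])
```

In the balanced design every covariate value occurs once in each group, so $a = b = 0$ and the inflation term is zero under the null hypothesis; in the unbalanced design it is generally positive.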


3. The Level of the Test CT

In Example 3 we saw that when $\lambda = 0$ and the covariates are
independent of the treatment assignment, then the test CT has the
correct level. A generalization of this to the model (1) and hypothesis (2)
might then need $\{v_i\}$ to be independent of $\{w_i\}$. In the following
we state conditions which formalize this notion and are at the same
time not restricted to the MLE $\lambda^*$ as an estimate of $\lambda$. Our
assumptions are stated in such a fashion as to allow the design $\{x_i\}$
to also be randomly generated.

Assumption 1. $\lambda^*$ is root-N consistent, i.e., $N^{1/2}(\lambda^* - \lambda) = O_p(1)$.

Assumption 2. $N^{1/2}(\hat\beta - \beta)$ is asymptotically normally distributed.

Assumption 3. There exists $\sigma^*$ with $\sigma^*/\sigma \to 1$ in probability.

Assumption 4. For the sequence $\{w_i\}$,

$$N^{-1}\sum_{i=1}^{N} w_i \to 0, \qquad N^{-1}\sum_{i=1}^{N} w_i^2 \to 1,$$

almost surely if the $\{w_i\}$ are random.

Assumption 5. Let $F_N(v, w, e)$ be the empirical distribution function
of $\{(x_i, \varepsilon_i) = (v_i, w_i, \varepsilon_i)\}$. Suppose there exist distribution functions
$F_1$, $F_2$ and $F_3$ such that

$$F_N(v, w, e) \to F_1(v)F_2(w)F_3(e)$$

almost surely. Further, $\{(x_i, \varepsilon_i)\}$ are uniformly bounded.


Theorem. Make the Assumptions 1 - 5. Then under $H_0: \gamma_2 = 0$ the
statistics $N^{1/2}\hat\gamma_2/\sigma^*$ and $N^{1/2}\gamma_2^*/\sigma^*$ are each asymptotically normally
distributed with mean zero and variance one. Hence, the test CT has
the correct asymptotic level. □

Note that the theorem is stated only in terms of the design, as
long as we have appropriate estimates of $\lambda$ and $\sigma$. The value of
$\lambda$ itself is not important, in contrast to Examples 1 - 3.

Some comment on the assumptions is in order. Assumption 1 is
crucial, but obviously $\lambda^*$ need not be the normal theory MLE; see
Hinkley (1975), Carroll (1980) and Bickel and Doksum (1980) for other
suggestions. Assumption 4 will hold if there is an intercept; otherwise
it seems necessary, as the work of Bickel and Doksum indicates for simple
regression through the origin. Assumptions 2 - 3 are hardly onerous.

The only restrictive assumption is Assumption 5. The boundedness is needed
only in a technical sense to make the proof fairly easy. Also, one of the
most common assumptions in regression is that the errors $\{\varepsilon_i\}$ are
independent of the design $\{x_i = (v_i \;\; w_i)\}$. The heart of Assumption 5
is thus the requirement that $\{v_i\}$ be unrelated to $\{w_i\}$.

When is this requirement satisfied? It certainly holds for simple
linear regression under Assumption 4, either with an intercept or through the origin.
More importantly, it holds for balanced $k \times 2$ designs where the test
is for the treatment effect. Another important example in which
Assumption 5 will hold is general two-group analysis of covariance in which
the covariates are equally distributed across the treatments, as might
occur with random or blocked allocation.
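The balance requirement can be made concrete with a fully crossed design, in which the empirical distribution of the design points factorizes exactly. The sketch below uses hypothetical covariate and treatment levels and a hypothetical helper `crossed_design`; it only illustrates the moment conditions and is not part of the formal argument.

```python
import numpy as np
from itertools import product


def crossed_design(v_levels, w_levels):
    """Design matrix (1, v, w) with every v level paired with every w level."""
    pairs = np.array(list(product(v_levels, w_levels)), dtype=float)
    v, w = pairs[:, 0], pairs[:, 1]
    w = (w - w.mean()) / w.std()      # Assumption 4: mean 0, mean square 1
    return np.column_stack([np.ones(len(v)), v, w])


# Hypothetical covariate levels crossed with a two-level treatment.
A = crossed_design([0.0, 1.0, 2.0, 5.0], [-1.0, 1.0])
moment = A.T @ A / len(A)             # N^{-1} A'A
cross_terms = moment[:2, 2]           # moments linking (1, v) with w
```

Here `cross_terms` vanishes and the last diagonal entry of `moment` equals one, so the moment matrix is block diagonal, which is the form the proof relies on.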

It should also be noted that the CT test will be of the correct
level even when $\{w_i\}$ and $\gamma_2$ are vectors rather than scalars, as in
general factorial designs or many-group analysis of covariance. The
requirement remains that $\{w_i\}$ should be unrelated to $\{v_i\}$.


We have shown that the rather naive test CT, which picks a scale $\lambda^*$
and then performs the usual F-test, has the correct asymptotic level in
many important statistical problems. Generally speaking, this level
obtains in balanced designs. When the test CT has the correct asymptotic
level, it generally outperforms Wald's test and the likelihood ratio test.


6.  Proof  of  the  Theorem 

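Write $A' = (x_1' \; \cdots \; x_N')$. By Assumptions 4 - 5 we have

$$N^{-1}A'A \longrightarrow \begin{pmatrix} \Sigma^* & 0 \\ 0 & 1 \end{pmatrix}, \tag{8}$$

with $\Sigma^*$ denoting the limit of $N^{-1}\sum_i v_i'v_i$,
the convergence being with probability one in case of randomness. By (8)
and Assumptions 2 - 4, $N^{1/2}\hat\gamma_2/\sigma^*$ is asymptotically normally
distributed with mean zero and variance one, so it suffices to show
that under $H_0: \gamma_2 = 0$,

$$N^{1/2}(\gamma_2^* - \hat\gamma_2)/\sigma \xrightarrow{\;p\;} 0. \tag{9}$$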


Because of (8), (9) will follow by proving

$$N^{-1/2}\sum_{i=1}^{N} w_i\bigl(y_i^{(\lambda^*)} - y_i^{(\lambda)}\bigr)/\sigma \xrightarrow{\;p\;} 0. \tag{10}$$

By a Taylor expansion, the left hand side of (10) becomes


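$$N^{1/2}(\lambda^* - \lambda)\; N^{-1}\sum_{i=1}^{N} w_i\,\frac{\partial}{\partial\lambda}\,y_i^{(\lambda)}\Big/\sigma + o_p(1),$$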


so that by Assumption 1 we need only show

